ABSTRACT


Effective stewardship of data is a critical precursor to making data FAIR. The goal of this paper is to give
an overview of the current state of the art in data management and data stewardship planning (DMP) solutions.
We begin by arguing why data management is an important vehicle for the adoption and implementation
of the FAIR principles, and we describe the background, context and historical development, as well as the
major driving forces, namely research initiatives and funders. We then provide an overview of the current leading
DMP tools in the form of a table presenting their key characteristics. Next, we elaborate on emerging common
standards for DMPs, especially machine-actionable DMPs. As sound data management planning is not only a
precursor of FAIR data stewardship but also an integral part of it, we discuss its positioning in the emerging
FAIR tools ecosystem. Capacity building and training activities are an important ingredient in the whole effort.
Although it is not the primary goal of this paper, we also touch on research workforce support, as tools can
only be as effective as their users are competent to use them properly. We conclude by discussing the
relations of DMP to the FAIR principles, as there are important connections other than being a precursor.


1. INTRODUCTION 


Effective stewardship of data is a critical precursor to making data FAIR, which is why researchers should
develop Data Management Plans (DMPs) from the early stages of their research. It is desirable to
share data wherever possible. This requires obtaining the necessary permissions (either via consent
agreements or for third-party data), choosing appropriate formats and standards, and providing rich
documentation so that data are meaningful to other stakeholders, including machines.


This paper will cover the convergence in data policy, tools and standards for DMPs, highlighting 
opportunities to facilitate the planning process and make better use of the information gathered. 


1.1. Convergence: finding the common ground 


DMPs are becoming commonplace across the globe. Several UK and US research funders have had expectations
in place for well over a decade, and an increasing number of governments, funding agencies
and research organisations are releasing expectations that either require or encourage the development of
data plans [3].


Although many funders have individual templates and guidelines for DMPs, there is considerable
alignment in the content. The Digital Curation Centre (DCC) analysed the different UK and international
requirements to agree on a set of DMP themes representing the main aspects addressed [4]. These
cover topics such as "data formats", "ethics", "data sharing" and "preservation". The themes are used in
DMPonline to allow questions and guidance to be associated. In 2018, the California Digital Library and
the Digital Curation Centre revisited this exercise to apply it to the many new US funder templates and to
consult with the wider Research Data Management (RDM) community on convergence. This resulted in 14
themes which act as a baseline for expectations and were used as an initial input to inspire the RDA
Working Group on Common Standards for DMPs [5].


Convergence becomes more important when projects are carried out at multiple institutes and/or funded
by several different funders. It is impractical to create a separate data management plan for each institute
and for each funder. The DMPRoadmap platform used by DMPonline, DMPTool and other
national DMP services addresses this by allowing local requirements to be added to existing funder templates,
so that researchers do not have to write one DMP for their funder and another for their institution.
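
The layering of local requirements on top of a funder template can be illustrated with a short sketch. This is not DMPRoadmap's actual data model or code: the `Question` and `Template` classes, the `with_local_requirements` function, and the example themes and question texts are all hypothetical and serve only to show the idea of producing a single combined plan.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Question:
    theme: str  # illustrative theme name, e.g. "data sharing"
    text: str


@dataclass
class Template:
    owner: str
    questions: List[Question] = field(default_factory=list)


def with_local_requirements(funder: Template, local: Template) -> Template:
    """Return one combined template: funder questions first, then any
    institutional questions the funder template does not already ask."""
    seen = {(q.theme, q.text) for q in funder.questions}
    extra = [q for q in local.questions if (q.theme, q.text) not in seen]
    return Template(
        owner=f"{funder.owner} + {local.owner}",
        questions=[*funder.questions, *extra],
    )


# Hypothetical funder and institutional templates.
funder = Template("Example Funder", [
    Question("data sharing", "How will the data be shared?"),
    Question("preservation", "Which data will be preserved long term?"),
])
institution = Template("Example University", [
    Question("data sharing", "How will the data be shared?"),  # already asked
    Question("storage", "Which institutional storage will be used?"),
])

combined = with_local_requirements(funder, institution)
for q in combined.questions:
    print(f"[{q.theme}] {q.text}")
```

In this sketch the researcher answers three questions once, rather than answering the shared question twice in two separate plans.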


On the RDA Working Group on Common Standards for DMPs, see article 20 (p. 199) in this special issue.

For a history of UK funder policy dating back to 1996, see [1]. In the USA, the National Institutes of Health (NIH) has
expected data sharing statements since 2003, see [2].

A comprehensive list of DMP templates is curated by FAIRsharing.org, as explained in paper 15 (p. 151) in this special issue.
DMPRoadmap [6] is an open source codebase for a Data Management Planning tool, jointly managed by
the Digital Curation Centre and the University of California Curation Center (UC3). It represents a large
effort to converge on a single solution that combines the best features of earlier versions of
DMPonline and DMPTool.


Policy harmonisation can also be a solution to differing requirements, and various groups such as UK
Research and Innovation, Science Europe, the Belmont Forum and the Research Data Alliance funders
forum are promoting convergence. Science Europe ran a workshop in 2018 to bring together European
stakeholders on RDM requirements and DMPs. There are two strands of resulting activity: promoting policy
harmonisation and agreeing on a common set of DMP requirements which can be extended by domain-specific
expectations (domain data protocols, DDPs). Similarly, the Belmont Forum, an international
consortium of 29+ science funding agencies, has worked closely with data organizations such as RDA and
CODATA and with science publishers to develop and align its data and digital outputs management requirements
template with FAIR and open data best practices. The Belmont Forum is integrating its open data requirements
with its online proposal and review processes.


Some research funders use the outcomes of DMP reviews to improve their guidelines. The Economic and
Social Research Council (ESRC), the Belmont Forum, the Health Research Board (HRB) in Ireland and ZonMw, a
medical science funder in the Netherlands, have already released several iterations of their guidelines for
researchers, with various adjustments to increase clarity and improve the quality of the responses given.
Funders such as ZonMw, HRB and the Wellcome Trust are exploring ways to enable easier monitoring of DMPs
by providing more structured questions which can be automatically assessed and by connecting DMPs to local
grants systems [7]. Some designated data centres supported by the Natural Environment Research Council
(NERC) in the UK and the National Science Foundation (NSF) in the US, such as Woods Hole, are also discussing
with tool providers how DMPs can be aligned with local systems and support processes.


2. DMP TOOLS 


Many DMP tools are available worldwide. The earliest date from 2010-2011, when DMPonline and
DMPTool were launched in the UK and USA, respectively [8]. The Digital Curation Centre and California
Digital Library, which operate these services, converged on a joint open source, community-led codebase
called DMPRoadmap in 2018. This codebase is used in several international services including DMPAssistant
in Canada, DMPTuuli in Finland, DMP OPIDoR in France, PGDonline in Spain and DEIC's version of
DMPonline in Denmark. In recent years, many other new DMP tools have been released. Most provide
functionality to create, share, export and review DMPs; however, different aspects have been emphasised.
Several focus on providing closed questions instead of free text, backed by underlying knowledge bases that
help prompt and guide users. Others take a project or data set focus, and also link to other tools for data
documentation and storage allocation to support implementation. Below is a summary of the main services,
highlighting the differences in functionality and approaches to deployment and sustainability.


See article 17 (p. 171) in this special issue.
3. COMMON STANDARDS FOR DMPS


Given the range of tools emerging and the opportunity to connect DMPs with other research systems 
and processes, greater coordination and adoption of standards to enable data exchange and interoperability 
is needed. A request to establish the DMP Common Standards working group [29] was articulated during 
the 9th Research Data Alliance (RDA) plenary meeting in Barcelona. The discussion was framed by a white 
paper by Simms et al. on machine-actionable data management plans (maDMPs) [30]. The white paper is 
based on outputs from the IDCC workshop held in Edinburgh in 2017 that gathered almost 50 participants 
from Africa, America, Australia, and Europe. It describes eight community use cases which articulate 
consensus about the need for a common standard for maDMPs (where machine actionable is defined as 
"information that is structured in a consistent way so that machines, or computers, can be programmed 
against the structure"). 
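
The quoted definition can be made concrete with a small sketch contrasting a free-text answer with the same information held in a consistent structure. The field names (`datasets`, `size_gb` and so on) and the `total_storage_gb` function are assumptions made for illustration; they are not taken from the white paper or from any DMP tool.

```python
# Free-text answer: easy for a person to read, hard for a program to use.
free_text = "We expect about 2 TB of imaging data and some survey files."

# The same information as a structured record (hypothetical field names).
structured = {
    "datasets": [
        {"title": "Imaging data", "format": "TIFF", "size_gb": 2000},
        {"title": "Survey responses", "format": "CSV", "size_gb": 1},
    ]
}


def total_storage_gb(plan: dict) -> int:
    """Sum the declared dataset sizes. Because the structure is consistent,
    the same function works for every plan that follows it."""
    return sum(ds["size_gb"] for ds in plan["datasets"])


print(total_storage_gb(structured))  # 2001
```

A storage provider could, for example, run such a function over many plans to estimate demand, which is not feasible with the free-text form without manual reading.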


The RDA working group on DMP Common Standards has developed a common model [31] for machine-actionable
DMPs that enables exchange of information between systems acting on behalf of stakeholders
involved in the research life cycle, such as researchers, funders, repository managers, ICT providers and
librarians. The group has also implemented prototype workflows [32] to demonstrate how machine-actionable
DMPs can be implemented by connecting them to various systems, such as CRIS, repositories
or funder systems.


The model is independent of specific funder requirements and provides a common set of concepts
needed to represent DMP-specific information in a majority of settings. Since it is meant as a format for
exchanging information between systems, it is also independent of the internal architecture of any software
that adopts the common standard. Furthermore, the model can be serialised into different representations, for
example JSON, XML or OWL. In 2019, ten principles for maDMPs were set out in [33].
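
To illustrate what serialising one model into several representations means in practice, the following sketch writes the same small record as JSON and as XML using only the Python standard library. The record and its keys (`title`, `contact`, `datasets`, `license`) are simplified placeholders chosen for this example and should not be read as the RDA common model; the functions `to_json` and `to_xml` are likewise hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, simplified DMP record; the keys are illustrative only
# and are not taken from the RDA common model.
dmp = {
    "title": "Example project DMP",
    "contact": "jane.doe@example.org",
    "datasets": [
        {"title": "Survey responses", "format": "CSV", "license": "CC-BY-4.0"},
    ],
}


def to_json(record: dict) -> str:
    """Serialise the record as indented JSON text."""
    return json.dumps(record, indent=2, sort_keys=True)


def to_xml(record: dict) -> str:
    """Serialise the same record as a small XML document."""
    root = ET.Element("dmp")
    ET.SubElement(root, "title").text = record["title"]
    ET.SubElement(root, "contact").text = record["contact"]
    datasets = ET.SubElement(root, "datasets")
    for ds in record["datasets"]:
        node = ET.SubElement(datasets, "dataset")
        for key, value in ds.items():
            ET.SubElement(node, key).text = value
    return ET.tostring(root, encoding="unicode")


print(to_json(dmp))
print(to_xml(dmp))

# Both serialisations carry the same information: parsing either one
# recovers the same dataset title.
json_title = json.loads(to_json(dmp))["datasets"][0]["title"]
xml_title = ET.fromstring(to_xml(dmp)).findtext("datasets/dataset/title")
assert json_title == xml_title == "Survey responses"
```

The point of the sketch is that the choice of serialisation is independent of the information model, so two systems can exchange a plan as long as they agree on the model and on one representation of it.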


As of March 2019, the RDA DMP Common Standards Working Group is documenting the model and
creating examples in JSON to facilitate adoption. It is also engaging with pilot users who wish to deploy
the model. These include tool providers, such as DMPonline, as well as universities such as TU Delft. The
next steps for DMP Common Standards are to develop further serialisations of the model, reach out to new
pilot users and maintain the model by incorporating feedback from its deployments. All DMP tool
providers are encouraged to adopt the common data model to promote interoperability across services and
reuse of information held in DMPs.


4. DMP TOOLS IN THE CONTEXT OF THE FAIR TOOLS ECOSYSTEM


As data management encompasses the whole life cycle of data, all tools related to achieving FAIRness
are relevant, in particular those for discovering existing data sets, evaluating FAIRness and publishing.


See article 9 (p. 87) in this special issue.


Tight integration between DMP and FAIR tools is not currently in place, at least not at an appropriate level of
adoption and maturity. The first efforts have been formulated in "The FAIR Funder Pilot Programme"
[7], which will attempt to use DMPs as an indication of the prospective FAIRness of data by connecting
FAIR metrics to perform an automated assessment. DMPonline has also integrated the RDA Metadata
Standards Directory to help researchers identify and adopt relevant standards, and forthcoming tools such
as OpenDMP are intended to focus on API integrations to connect different tools in the research system.
The DMP Common Standards work lays important cornerstones for interoperability in the tooling ecosystem.
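
The idea of deriving an indication of prospective FAIRness from a DMP can be sketched as follows, assuming the plan contains closed yes/no questions. The question names, their mapping to the four FAIR letters, and the `prospective_fairness` function are invented for this illustration; they are not the FAIR metrics themselves, nor the method used in the pilot programme or in any existing tool.

```python
# Hypothetical closed questions, each tied to one FAIR letter.
CHECKS = {
    "persistent_identifier": "F",
    "metadata_in_searchable_registry": "F",
    "open_access_protocol": "A",
    "community_metadata_standard": "I",
    "explicit_license": "R",
}


def prospective_fairness(answers: dict) -> dict:
    """Return, per FAIR letter, the fraction of related questions
    answered 'yes'. Unanswered questions count as 'no'."""
    totals = {}
    passed = {}
    for question, letter in CHECKS.items():
        totals[letter] = totals.get(letter, 0) + 1
        if answers.get(question, False):
            passed[letter] = passed.get(letter, 0) + 1
    return {letter: passed.get(letter, 0) / totals[letter] for letter in totals}


# Example answers as they might be exported from a DMP with closed questions.
answers = {
    "persistent_identifier": True,
    "metadata_in_searchable_registry": False,
    "open_access_protocol": True,
    "community_metadata_standard": True,
    "explicit_license": False,
}
print(prospective_fairness(answers))
# {'F': 0.5, 'A': 1.0, 'I': 1.0, 'R': 0.0}
```

Such a result is only an indication of what the data would look like if the plan is followed; it says nothing about whether the plan is actually implemented.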


Another dimension of such an ecosystem is internationalisation. Individual DMP tools offer translations
into various European and world languages; however, FAIR, through its focus on machine-actionability and
strong semantics, starts to show a way to perform language-neutral science. This prospect is also very
interesting for data management planning, with possibilities not yet fully fathomed.


5. RESEARCH WORKFORCE SUPPORT 


Capacity building activities are an important complement to data management tools and guidance.
Training resources exist in a variety of formats to help researchers successfully prepare a DMP and, more
importantly, implement it. As data needs vary across domains, methods and research objectives,
flexibility is an important consideration in capacity building. The Belmont Forum has developed a
Data Skills Curricula Framework that can guide research teams and agencies in developing sustainable DMPs
and in putting together the components of a comprehensive program for data-enabled research, or serve as
a resource for discovering the types of training that exist in order to develop a customised plan. The Framework
emphasizes full-path data management and role-specific approaches to help teams identify who needs
which skills, and when to turn to a data professional for assistance [34].


The RDA/CODATA Research Data Science Schools [35], which have been running since 2016 to equip
early career researchers with core data handling, visualisation, computational infrastructure and open
science skills, are now exploring parallel data stewardship courses supported by the FAIRsFAIR project [36],
in which researchers and data stewards will be co-taught. The curriculum will promote collaboration and the
range of skills and inputs needed to effectively manage and share FAIR data. Resources such as the Data
Management Training Clearinghouse, the Belmont Forum e-Infrastructures and Data Management Toolkit
and the FOSTER Open Science portal have compiled globally oriented free online training materials, many
of which have been collaboratively developed in line with the principles of FAIR and Open Educational
Resources.


In ELIXIR, the "Towards Data Stewardship in ELIXIR: Training & Portal" Implementation Study, which ran in
2018-2019, has brought advancements both in the Data Stewardship Wizard tool and in the training
materials for the data stewardship curriculum.


See article 17 (p. 171) in this special issue.
6. CONCLUSIONS AND DISCUSSION


The following points summarise the relation between the FAIR principles and data management planning
and the respective tools.


1. A sound and elaborate data management plan is a necessary precursor to making data FAIR.

2. DMP tools can help data become more FAIR by giving feedback: FAIR metrics calculations indicate
the assumed FAIRness of the data if the DMP is followed (e.g. Data Stewardship Wizard [17]).

3. DMPs can themselves be FAIR. The RDA DMP Common Standards Working Group has delivered a
prototype model which should now be implemented by the growing number of DMP tool providers.
If we have a standard expression for the content within DMPs, we will be able to exchange information
more easily with the wider ecosystem of FAIR services and provide greater value to researchers,
funders and service providers using DMPs. This covers "I" and "R". "F" and "A" still need to be
achieved by storing DMPs in suitable repositories, which does not yet happen in general. An open question
is, of course, which information needs to be FAIR and what can be shared (DMPs tend to be
inherently confidential); this is a discussion the community has yet to have.


Convergence in research data policy and the adoption of standards for DMP tools are desirable to help
clarify the landscape and enable interoperability. We are unlikely to achieve a single DMP template or one
tool, and neither is that desirable. DMPs should be tailored to the context in which research is conducted.
For some, ethical issues will be prevalent, while for others the challenges will be scale or timely data
sharing. Also, even in the same human language, different research disciplines may use different words to
express data management topics; using the right terminology is critical to facilitating use of the
tools and templates.